Goto

Collaborating Authors

 relative importance


Learning Best Combination for Efficient N: MSparsity

Neural Information Processing Systems

By forcing at most N out of M consecutive weights to be non-zero, the recent N:M network sparsity has received increasing attention for its two attractive advantages: 1) Promising performance at a high sparsity.




Left Leaning Models: How AI Evaluates Economic Policy?

arXiv.org Artificial Intelligence

Would artificial intelligence (AI) cut interest rates or adopt conservative monetary policy? Would it deregulate or opt for a more controlled economy? As AI use by economic policymakers, academics, and market participants grows exponentially, it is becoming critical to understand AI preferences over economic policy. However, these preferences are not yet systematically evaluated and remain a black box. This paper makes a conjoint experiment on leading large language models (LLMs) from OpenAI, Anthropic, and Google, asking them to evaluate economic policy under multi-factor constraints. The results are remarkably consistent across models: most LLMs exhibit a strong preference for high growth, low unemployment, and low inequality over traditional macroeconomic concerns such as low inflation and low public debt. Scenario-specific experiments show that LLMs are sensitive to context but still display strong preferences for low unemployment and low inequality even in monetary-policy settings. Numerical sensitivity tests reveal intuitive responses to quantitative changes but also uncover non-linear patterns such as loss aversion.


Regularization Learning Networks: Deep Learning for Tabular Datasets

Neural Information Processing Systems

Despite their impressive performance, Deep Neural Networks (DNNs) typically underperform Gradient Boosting Trees (GBTs) on many tabular-dataset learning tasks. W e propose that applying a different regularization coefficient to each weight might boost the performance of DNNs by allowing them t o make more use of the more relevant inputs. However, this will lead to an int ractable number of hyperparameters. Here, we introduce Regularization Learning Networks (RLNs), which overcome this challenge by introducing an efficient hy perparameter tuning scheme which minimizes a new Counterfactual Loss . Our results show that RLNs significantly improve DNNs on tabular datasets, and achieve comparable results to GBTs, with the best performance achieved with an ensemble that combines GBTs and RLNs. RLNs produce extremely sparse networks, elim inating up to 99 .


Oh That Looks Familiar: A Novel Similarity Measure for Spreadsheet Template Discovery

arXiv.org Artificial Intelligence

Traditional methods for identifying structurally similar spreadsheets fail to capture the spatial layouts and type patterns defining templates. To quantify spreadsheet similarity, we introduce a hybrid distance metric that combines semantic embeddings, data type information, and spatial positioning. In order to calculate spreadsheet similarity, our method converts spreadsheets into cell-level embeddings and then uses aggregation techniques like Chamfer and Hausdorff distances. Experiments across template families demonstrate superior unsupervised clustering performance compared to the graph-based Mondrian baseline, achieving perfect template reconstruction (Adjusted Rand Index of 1.00 versus 0.90) on the FUSTE dataset. Our approach facilitates large-scale automated template discovery, which in turn enables downstream applications such as retrieval-augmented generation over tabular collections, model training, and bulk data cleaning.


Sentiment Matters: An Analysis of 200 Human-SAV Interactions

arXiv.org Artificial Intelligence

Shared Autonomous Vehicles (SAVs) are likely to become an important part of the transportation system, making effective human-SAV interactions an important area of research. This paper introduces a dataset of 200 human-SAV interactions to further this area of study. We present an open-source human-SAV conversational dataset, comprising both textual data (e.g., 2,136 human-SAV exchanges) and empirical data (e.g., post-interaction survey results on a range of psychological factors). The dataset's utility is demonstrated through two benchmark case studies: First, using random forest modeling and chord diagrams, we identify key predictors of SAV acceptance and perceived service quality, highlighting the critical influence of response sentiment polarity (i.e., perceived positivity). Second, we benchmark the performance of an LLM-based sentiment analysis tool against the traditional lexicon-based TextBlob method. Results indicate that even simple zero-shot LLM prompts more closely align with user-reported sentiment, though limitations remain. This study provides novel insights for designing conversational SAV interfaces and establishes a foundation for further exploration into advanced sentiment modeling, adaptive user interactions, and multimodal conversational systems.



Feature-based analysis of oral narratives from Afrikaans and isiXhosa children

arXiv.org Artificial Intelligence

Oral narrative skills are strong predictors of later literacy development. This study examines the features of oral narratives from children who were identified by experts as requiring intervention. Using simple machine learning methods, we analyse recorded stories from four- and five-year-old Afrikaans- and isiXhosa-speaking children. Consistent with prior research, we identify lexical diversity (unique words) and length-based features (mean utterance length) as indicators of typical development, but features like articulation rate prove less informative. Despite cross-linguistic variation in part-of-speech patterns, the use of specific verbs and auxiliaries associated with goal-directed storytelling is correlated with a reduced likelihood of requiring intervention. Our analysis of two linguistically distinct languages reveals both language-specific and shared predictors of narrative proficiency, with implications for early assessment in multilingual contexts.


Transformers are Graph Neural Networks

arXiv.org Artificial Intelligence

We establish connections between the Transformer architecture, originally introduced for natural language processing, and Graph Neural Networks (GNNs) for representation learning on graphs. We show how Transformers can be viewed as message passing GNNs operating on fully connected graphs of tokens, where the self-attention mechanism capture the relative importance of all tokens w.r.t. each-other, and positional encodings provide hints about sequential ordering or structure. Thus, Transformers are expressive set processing networks that learn relationships among input elements without being constrained by apriori graphs. Despite this mathematical connection to GNNs, Transformers are implemented via dense matrix operations that are significantly more efficient on modern hardware than sparse message passing. This leads to the perspective that Transformers are GNNs currently winning the hardware lottery.